Discrete Cepstrum Coefficients as Perceptual Features

نویسندگان

  • Wim D'haes
  • Xavier Rodet
چکیده

Cepstrum coefficients are widely used as features for both speech and music. In this paper, the use of discrete cepstrum coefficients is considered, which are computed from sinusoidal peaks in the short time spectrum. These coefficients are very interesting as features for pattern recognition applications since they allow to represent spectra by points in a multidimensional vector space. A new Mel frequency warping method is proposed that allows to compute the spectral envelope on the Mel scale which, by contrast to current estimation techniques, does not rely on manually set parameters. Furthermore, the robustness and perceptual relevance of the coefficients are studied and improved.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Modified Perceptual Linear Prediction Liftered Cepstrum (MPLPLC) Model for Pop Cover Song Recognition

Most of the features of Cover Song Identification (CSI), for example, Pitch Class Profile (PCP) related features, are based on the musical facets shared among cover versions: melody evolution and harmonic progression. In this work, the perceptual feature was studied for CSI. Our idea was to modify the Perceptual Linear Prediction (PLP) model in the field of Automatic Speech Recognition (ASR) by...

متن کامل

Comparison of Speech Features on the Speech Recognition Task

In the present work we overview some recently proposed discrete Fourier transform (DFT)and discrete wavelet packet transform (DWPT)-based speech parameterization methods and evaluate their performance on the speech recognition task. Specifically, in order to assess the practical value of these less studied speech parameterization methods, we evaluate them in a common experimental setup and comp...

متن کامل

Frequency warping and robust speaker verification: a comparison of alternative mel-scale representations

Accuracy of speaker verification is high under controlled conditions but falls off rapidly in the presence of interfering sounds. This is because spectral features, such as Mel-frequency cepstral coefficients (MFCCs), are sensitive to additive noise. MFCCs are a particular realization of warped-frequency representation with low-frequency focus. But there are several alternative, potentially mor...

متن کامل

Spectral Subband Centroids as Complementary Features for Speaker Authentication

Most conventional features used in speaker authentication are based on estimation of spectral envelopes in one way or another, e.g., Mel-scale Filterbank Cepstrum Coefficients (MFCCs), Linear-scale Filterbank Cepstrum Coefficients (LFCCs) and Relative Spectral Perceptual Linear Prediction (RASTA-PLP). In this study, Spectral Subband Centroids (SSCs) are examined. These features are the centroid...

متن کامل

Noise-Robust Speech Features Based on Cepstral Time Coefficients

In this paper, we investigate the noise-robustness of features based on the cepstral time coefficients (CTC). By cepstral time coefficients, we mean the coefficients obtained from applying the discrete cosine transform to the commonly used mel-frequency cepstral coefficients (MFCC). Furthermore, we apply temporal filters used for computing delta and acceleration dynamic features to the CTC, res...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003